Epidemiology of Enteroaggregative, Enteropathogenic, and Shiga Toxin–Producing Escherichia coli Among Children Aged <5 Years in 3 Countries in Africa, 2015–2018: Vaccine Impact on Diarrhea in Africa (VIDA) Study

Abstract
Background: To address knowledge gaps regarding diarrheagenic Escherichia coli (DEC) in Africa, we assessed the clinical and epidemiological features of children with moderate-to-severe diarrhea (MSD) who were positive for enteroaggregative E. coli (EAEC), enteropathogenic E. coli (EPEC), or Shiga toxin–producing E. coli (STEC) in Mali, The Gambia, and Kenya.
Methods: Between May 2015 and July 2018, children aged 0–59 months with medically attended MSD and matched controls without diarrhea were enrolled. Stools were tested conventionally using culture and multiplex polymerase chain reaction (PCR), and by quantitative PCR (qPCR) using the TaqMan Array Card (TAC). We assessed DEC detection by site, age, clinical characteristics, and enteric coinfection.
Results: Among 4840 children with MSD and 6213 matched controls enrolled, 4836 cases and 1 control per case were tested using qPCR. Of the DEC detected with TAC, 61.1% were EAEC, 25.3% atypical EPEC (aEPEC), 22.4% typical EPEC (tEPEC), and 7.2% STEC. Detection was higher in controls than in MSD cases for EAEC (63.9% vs 58.3%, P < .01), aEPEC (27.3% vs 23.3%, P < .01), and STEC (9.3% vs 5.1%, P < .01). EAEC and tEPEC were more frequent in children aged <23 months, aEPEC was similar across age strata, and STEC increased with age. No association between nutritional status at follow-up and DEC pathotypes was found. DEC coinfection with Shigella/enteroinvasive E. coli was more common among cases (P < .01).
Conclusions: No significant association was detected between EAEC, tEPEC, aEPEC, or STEC and MSD using either conventional assay or TAC. Genomic analysis may provide a better definition of the virulence factors associated with diarrheal disease.

While residing in the human intestine as commensal flora, strains of Escherichia coli have acquired the ability to cause diarrheal disease via the horizontal transfer of virulence genes from cohabitating intestinal bacteria [1]. Over time, transformed strains with a survival advantage have propagated within different E. coli phylogenetic lineages to become pathotypes that produce a broad spectrum of diarrheal diseases, sometimes with severe consequences [2]. These pathotypes are known collectively as diarrheagenic E. coli (DEC), and 5 distinct categories have been identified: enteroaggregative E. coli (EAEC), enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), Shiga toxin-producing E. coli (STEC)/enterohemorrhagic E. coli, and enteroinvasive E. coli (EIEC). The pathogenicity of a sixth category, diffusely adherent E. coli, remains uncertain [1]. EPEC is further classified as typical (tEPEC) and atypical (aEPEC) based on the detection of bundle-forming pilus (bfpA) and/or E. coli attaching and effacing (eae) virulence genes, as described below [1]. In general, strains within each pathotype share virulence factors that produce similar clinical manifestations and pathologic features, often with characteristic host predilections, transmission dynamics, epidemiology, and disease burden. Together, they are among the leading causes of diarrhea-associated morbidity and mortality in children aged <5 years in low- and middle-income countries (LMICs) [3,4].
Traditionally, laboratory diagnosis of DEC required isolation of E. coli colonies from stool culture followed by detection of distinguishing pathotype-associated features using either phenotypic assays such as microscopy, serology, and antigen detection or molecular tests such as gene probe and polymerase chain reaction (PCR) [5]. The advent of molecular panels such as the quantitative PCR (qPCR)-based TaqMan Array card (TAC) introduced a tool with many advantages for performing diarrheal disease research, including high sensitivity, rapid throughput, and the ability to contemporaneously test stool samples directly for a broad array of pathogens. In addition, qPCR allows determination of pathogen-specific cycle threshold (Ct) values that can distinguish cases from controls under the assumption that symptomatic infections have higher pathogen burdens [6].
The Global Enteric Multicenter Study (GEMS), a case-control study of medically attended moderate-to-severe diarrhea (MSD) among children aged <5 years living in LMICs in Asia and Africa, identified DEC using conventional multiplex PCR to test E. coli isolated from stool samples [3]. ETEC that produced heat-stable toxin with or without heat-labile toxin was found to be a major cause of MSD. STEC was the least frequent DEC, in contrast to its predominance in high-resource settings, where it causes diarrhea associated with hemorrhagic colitis and hemolytic uremic syndrome [7]. Whereas tEPEC was significantly associated with MSD in children aged <2 years in Kenya and with death in infants aged <1 year, aEPEC was not associated with MSD or death at any site [3,8]. GEMS reported a high prevalence of EAEC among symptomatic and asymptomatic children, with inconsistent associations with diarrhea [3,9,10].
A retrospective etiological reanalysis of a subset of samples from cases and controls who participated in GEMS using TAC qPCR demonstrated an increase in the total pathogen-specific attributable diarrheal burden from 51.5% using culture plus multiplex PCR to 89.3% using TAC, suggesting that conventional methods underestimate the prevalence of many pathogens [11]. To update our understanding of the epidemiology of DEC among children with and without diarrhea, with a focus on settings where rotavirus vaccine introduction may have altered the landscape of enteric pathogens, we examined data from the Vaccine Impact on Diarrhea in Africa (VIDA) study, a follow-on study to GEMS at 3 sites in sub-Saharan Africa [12]. In VIDA, stool samples from children with MSD and their controls were contemporaneously tested using both conventional multiplex PCR and TAC, allowing for a comparison of the associations between DEC pathotypes and MSD using both diagnostic approaches. In addition, we determined whether DEC coinfection with other enteric pathogens increased the severity of disease, as has been reported elsewhere [13]. We specifically focused on EPEC, EAEC, and STEC, whose role in diarrheal disease has not been clearly elucidated in Africa. EIEC cannot be distinguished from Shigella using TAC and so was not included in this analysis. Because of its considerable burden, ETEC will be the subject of a separate article.

Study Design and Participants
Three sites (Bamako, Mali; Basse and Bansang, The Gambia; and Siaya County, Kenya) were selected from African countries with high childhood mortality [3] that had introduced rotavirus vaccine. VIDA participants resided within a demographic surveillance system (DSS) catchment area.
For 36 months at each site between May 2015 and July 2018, children in 3 age strata (0-11 months, 12-23 months, and 24-59 months) who sought care at health facilities serving the DSS were assessed for MSD as previously described [3]. MSD was defined as ≥3 loose stools within the last 24 hours [14] plus ≥1 of the following: sunken eyes, skin tenting, dysentery, required intravenous rehydration, or hospitalization within 7 days of diarrhea onset. Within 2 weeks of enrolling each MSD case, 1-3 diarrhea-free controls matched by age, sex, and neighborhood were randomly selected from the site's DSS database and enrolled at home, as described [3]. A follow-up home visit was made to every enrolled child 50-90 days after enrollment (average 60).
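The MSD case definition above can be expressed as a small screening function. This is a minimal sketch, not the study's data-capture logic: the record type, field names, and the reading that the 7-day window applies to the whole episode (rather than to hospitalization only) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiarrheaEpisode:
    """Hypothetical record of one clinic visit; field names are illustrative."""
    loose_stools_last_24h: int
    days_since_onset: int
    sunken_eyes: bool = False
    skin_tenting: bool = False
    dysentery: bool = False
    iv_rehydration: bool = False
    hospitalized: bool = False


def is_msd(ep: DiarrheaEpisode) -> bool:
    """Return True if the episode meets the MSD definition quoted in the text."""
    # At least 3 loose stools in the previous 24 hours.
    has_diarrhea = ep.loose_stools_last_24h >= 3
    # Assumption: onset within 7 days applies to the episode as a whole.
    recent_onset = ep.days_since_onset <= 7
    # At least one of the listed severity criteria must be present.
    severity_sign = any([
        ep.sunken_eyes,
        ep.skin_tenting,
        ep.dysentery,
        ep.iv_rehydration,
        ep.hospitalized,
    ])
    return has_diarrhea and recent_onset and severity_sign
```

For example, a child with 4 loose stools in the last 24 hours, onset 2 days ago, and dysentery would be flagged as MSD, whereas the same child without any severity criterion would not.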
Demographic, epidemiologic, and clinical data were collected, and anthropometry was performed for all cases and controls at the enrollment and follow-up visits as described [15]. To determine the duration of diarrhea, caregivers recorded the occurrence of diarrhea daily for 14 days after enrollment using a simple pictorial memory aid [16] that was reviewed with the caregiver and collected at the follow-up visit. The total duration of diarrhea was defined as the number of days with diarrhea prior to enrollment plus the number of days with diarrhea recorded on the memory aid during the 14 days after enrollment (Supplementary Figure 1).
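The duration calculation can be sketched as follows. This assumes the definition is read as days with diarrhea before enrollment plus the days marked as diarrhea days on the 14-day memory aid; the function and parameter names are hypothetical.

```python
from typing import Sequence


def total_diarrhea_days(days_before_enrollment: int,
                        memory_aid: Sequence[bool]) -> int:
    """Total episode duration in days.

    memory_aid holds one entry per post-enrollment day (True = diarrhea
    reported that day); exactly 14 entries are expected.
    """
    if len(memory_aid) != 14:
        raise ValueError("memory aid must cover exactly 14 days")
    if days_before_enrollment < 0:
        raise ValueError("days before enrollment cannot be negative")
    # Count only the post-enrollment days on which diarrhea was recorded.
    return days_before_enrollment + sum(1 for day in memory_aid if day)
```

A child with 3 days of diarrhea before enrollment and diarrhea on the first 4 days of the memory aid would have a total duration of 7 days under this reading.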

Sample Collection and Laboratory Analysis
Each case and control provided at least 3 g of fresh whole stool that was placed in cold storage within 1 hour of production. The whole stool was swabbed or a rectal swab was obtained if antibiotics were to be administered [17]. Swabs for culture of E. coli were placed in Cary-Blair transport media (Oxoid/REMEL, Inc, Lenexa, KS) within 6 hours of production and transported to the microbiology laboratory.

Diarrheagenic E. coli Detection Using Conventional Culture Plus PCR Methods
Immediately upon arrival at the laboratory, the swab was inoculated on culture media and incubated aerobically at 35°C-36°C for 18-24 hours. Three colonies of E. coli from every stool were pooled for PCR analysis. Primers targeting specific genes for detection of EPEC, EAEC, and STEC were used as previously described [17,18]. In brief, typical and atypical EPEC were identified by primers targeting bfpA and eae genes and classified as tEPEC (bfpA detected with or without eae) and as aEPEC (eae detected without bfpA, stx1, or stx2). EAEC presence was defined by detection of aaiC and/or aatA. STEC was defined by amplification of stx1 and/or stx2 (regardless of eae) without bfpA.
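The gene-based definitions above translate directly into a rule set. The sketch below is illustrative only: the function name and boolean inputs are hypothetical, and returning a set (so that EAEC can be reported alongside an EPEC or STEC call for the same pooled colonies) is an assumption, not something the text specifies.

```python
def classify_pathotypes(*, bfpA: bool = False, eae: bool = False,
                        stx1: bool = False, stx2: bool = False,
                        aaiC: bool = False, aatA: bool = False) -> set:
    """Map detected virulence genes to DEC pathotypes per the definitions in the text."""
    stx = stx1 or stx2
    pathotypes = set()
    # tEPEC: bfpA detected, with or without eae.
    if bfpA:
        pathotypes.add("tEPEC")
    # aEPEC: eae detected without bfpA, stx1, or stx2.
    if eae and not bfpA and not stx:
        pathotypes.add("aEPEC")
    # STEC: stx1 and/or stx2, regardless of eae, without bfpA.
    if stx and not bfpA:
        pathotypes.add("STEC")
    # EAEC: aaiC and/or aatA.
    if aaiC or aatA:
        pathotypes.add("EAEC")
    return pathotypes
```

Under these rules, a sample with eae and stx1 but no bfpA is classified as STEC rather than aEPEC, and a sample with no target genes returns an empty set.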

Diarrheagenic E. coli Pathotype Analysis Using TAC qPCR
Stool specimens were aliquoted and stored at −80°C until extraction. Total nucleic acid (TNA) was extracted from 200 mg of the whole stool specimen using the QIAamp Fast DNA stool mini kit (Qiagen, Hilden, Germany) with a modification that involved addition of glass beads to weighed stool before the addition of lysis buffer, then bead-beating to obtain a homogeneous mixture for TNA extraction [19]. The TNAs were tested on a real-time qPCR platform using TAC (Thermo Fisher, Carlsbad, CA), which amplifies nucleic acid for 30 enteropathogens plus multiple genotypes of several pathogens. The amplification curves were analyzed on ViiA 7 software (version 1.2.4) [11]. The primers are further described by Liu et al [19]. Samples with a target quantification Ct <35 were considered positive for that pathogen.
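The positivity rule is a simple threshold on the cycle threshold (Ct) value; because a lower Ct means the target was detected after fewer amplification cycles, lower values indicate more target nucleic acid. A minimal sketch follows, in which the function name and the use of None for wells with no amplification are assumptions.

```python
from typing import Optional

CT_CUTOFF = 35.0  # cutoff stated in the text: Ct < 35 is positive


def is_positive(ct: Optional[float], cutoff: float = CT_CUTOFF) -> bool:
    """Return True if a target is called positive for the given Ct value.

    None represents a well with no amplification (an assumed encoding).
    """
    return ct is not None and ct < cutoff
```

Note that the comparison is strict, so a Ct of exactly 35 is treated as negative.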

Data Analyses
First, we compared the proportions of case and control specimens in which each pathogen was detected by conventional and TAC methods. The remainder of the analyses used only the TAC results. We compared the Ct values in cases and controls using a Wilcoxon rank sum test to assess the relative pathogen quantities. We calculated the positivity proportions among cases and controls by demographic characteristics, site, clinical characteristics, severity, and rates of coinfection with other enteric pathogens detected using TAC, including adenovirus 40/41, Aeromonas spp., astrovirus, toxigenic Bacteroides fragilis, Campylobacter spp., Cryptosporidium spp., Enterocytozoon bieneusi, Helicobacter pylori, norovirus GI, norovirus GII, Plesiomonas spp., rotavirus, Salmonella spp., sapovirus, Shigella spp./EIEC, and heat-stable or heat-labile toxin-producing enterotoxigenic E. coli. χ² tests were used to compare categorical variables; a P value <.05 was considered statistically significant.
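For readers less familiar with the χ² test used to compare detection between cases and controls, the following standard-library sketch computes the Pearson statistic and P value for a 2×2 table. It is an illustration of the general technique, not the study's Stata code; whether a continuity correction was applied in the original analysis is not stated, and none is used here. The counts in the usage note are invented.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = cases/controls, columns = positive/negative.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With hypothetical counts of 30 positive and 70 negative cases versus 45 positive and 55 negative controls, `chi2_2x2(30, 70, 45, 55)` gives a statistic of 4.8 and a P value below .05.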
We analyzed MSD cases to determine whether EPEC, EAEC, or STEC was associated with stunting, as measured at the enrollment and follow-up visits and as described in Supplementary Table 1 and elsewhere [15]. Stunting was defined as a height/length-for-age z score >2 standard deviations below the World Health Organization child growth standard median [20]. We initially compared stunting at follow-up between pathotype-positive and pathotype-negative MSD cases using a χ² test. Thereafter, we used propensity score matching to limit potential selection bias from the design of the original case/control study (Supplementary Table 1). We report the average treatment effect for the treated, that is, the difference in expected growth if those who were infected actually had not been infected, with associated 95% confidence intervals and P values estimated via bootstrapping.
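The two quantities defined in this paragraph can be written compactly. The notation is introduced here for illustration and does not appear in the original: HAZ_i is the height/length-for-age z score of child i, D = 1 indicates that the pathotype was detected, and Y(1) and Y(0) are the potential growth outcomes with and without infection.

```latex
\[
\text{stunted}_i = \mathbf{1}\{\mathrm{HAZ}_i < -2\},
\qquad
\mathrm{ATT} = \mathbb{E}\bigl[\, Y(1) - Y(0) \mid D = 1 \,\bigr]
\]
```

Because Y(0) is never observed for infected children, propensity score matching substitutes the outcomes of uninfected children with similar covariate profiles.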
Statistical significance was defined as a P value < .05, and all analyses were performed using Stata/SE version 16. Written informed consent was obtained from the parent or primary caretaker of each child who met eligibility criteria before any research activities were performed.

RESULTS
Collectively, 4840 MSD cases and 6213 matched controls were enrolled across the 3 sites. The characteristics of cases and controls enrolled in the VIDA study are described elsewhere [12].
When case and control children evaluated using both TAC and conventional assays (n = 9672) were compared (Figure 1), the proportion who had pathogens detected using TAC was more than 2-fold higher than the proportion detected using conventional methods. Moreover, nearly all individual DEC pathotypes that were positive by conventional methods were also positive by TAC. Specifically, the proportion positive by conventional methods who had negative TAC results was only 4.4%, 1.3%, and 0.1% for EAEC, aEPEC, and STEC, respectively, among cases and 3.8%, 2.4%, and 0.1%, respectively, among controls (Supplementary Table 2). The exception was tEPEC, for which 5.7% of cases and 8.3% of controls were positive by conventional methods but negative by TAC. Nonetheless, the increased ability to detect DEC was consistent across cases and controls, so that the relative detection rates among cases and controls were similar regardless of method (Table 1). Likewise, the distribution of the Ct values for specific DEC pathotypes was similar in both cases and controls with the exception of tEPEC, which had significantly lower Ct values in cases than in controls (P = .0001; Supplementary Figure 2, Supplementary Table 3).
The distribution of EAEC, EPEC, and STEC pathotypes overall and by site, age, and sex according to case vs control status is presented in Table 2. EAEC was the most common pathotype detected (61.1%) followed by EPEC (aEPEC 25.3% and tEPEC 22.4%), with STEC being the least common (7.2%). The proportions of EAEC, aEPEC, and STEC were significantly lower in cases than in controls (58.3% vs 63.9%, 23.3% vs 27.3%, and 5.1% vs 9.3%, respectively; all P < .001). In contrast, tEPEC was similar in both cases and controls (22.3% vs 22.5%, P > .05). When eae was included in the definition of STEC, 3.3% of cases and 7.0% of controls met the TAC definition of positive, so the lack of association with MSD was unchanged (Table 2, Supplementary Table 4). The frequencies of stx1 and stx2 were similar within the case and control groups, and both genotypes were similar or less common in cases compared with controls. Among both cases and controls, the prevalence of EAEC and tEPEC declined with age, while the proportion of children with aEPEC was similar across ages and the prevalence of STEC increased with age. STEC frequency appeared to peak in MSD children aged 12-30 months (Figure 2).
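The overall percentages reported here are consistent with pooling the case and control percentages over equally sized groups. The sketch below illustrates that arithmetic; it assumes 4836 cases and 4836 controls (one tested control per tested case, n = 9672 in total, as stated earlier), and the function name is hypothetical.

```python
def pooled_percent(case_pct: float, control_pct: float,
                   n_cases: int = 4836, n_controls: int = 4836) -> float:
    """Overall positivity (%) from group-level percentages and group sizes."""
    positives = case_pct / 100 * n_cases + control_pct / 100 * n_controls
    return round(100 * positives / (n_cases + n_controls), 1)


# Reported case and control percentages for each pathotype.
reported = {
    "EAEC": (58.3, 63.9),
    "aEPEC": (23.3, 27.3),
    "tEPEC": (22.3, 22.5),
    "STEC": (5.1, 9.3),
}
overall = {name: pooled_percent(*pcts) for name, pcts in reported.items()}
```

Under this assumption the pooled values reproduce the overall figures in the text: 61.1% for EAEC, 25.3% for aEPEC, 22.4% for tEPEC, and 7.2% for STEC.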
Caretaker report of the presence of ruminant animals (cows, goats, or sheep) in the child's compound was explored as a potential source of infection with STEC. STEC positivity in the child's stool was significantly associated with ruminant exposure (P < .01) for both cases and controls across all age groups. More than 73% of STEC-positive case and control children had a ruminant animal present in their domiciles compared with 60% for STEC-negative cases and controls (Table 3).
Of the 4840 MSD cases enrolled in VIDA, 223 had missing length/height measurements and 14 had implausible values, reducing the analytical dataset to 4603 children with MSD for the stunting analysis. Although crude unadjusted analyses showed   Tables 6 and 7). DEC pathotypes were found more often in MSD episodes in which there was a mixed infection with another enteric pathogen than when they were the sole pathogens (Supplementary Table 8). Mixed infections with each DEC pathotype were significantly more common in cases than in controls for all pathotypes (EAEC, 96.0% vs 94.2%, P = .0018; tEPEC, 98.1% vs 97.5%, P = .0069; and aEPEC, 97.6% vs 94.6%, P = .0003). When trends were apparent, symptoms tended to be more marked when the DEC pathotype was part of a coinfection (Table 4). All deaths were reported in cases with coinfection. The frequency of coinfections was further evaluated to determine the dominant pair of a DEC pathotype and any of the most common enteric pathogens in cases and controls (Figure 3). In this analysis, coinfection with Shigella/EIEC was predominant and was consistently higher in cases (38%-48%) than in controls (28%-30%, P < .01). Notably, bloody diarrhea was not seen in any of the 6 episodes with STEC alone but was seen in 35 (31.3%) of the episodes with STEC plus Shigella/EIEC and in 26 (50%) of the episodes with Shigella/EIEC as the sole pathogen. Bloody diarrhea was observed more often when any of the other pathotypes was accompanied by Shigella vs those without concomitant Shigella, as follows: EAEC 289 (27.7%) vs 8 (7.1%), tEPEC 116 (27.1%) vs 1 (5%), and aEPEC 114 (27.4%) vs 2 (7.4%).
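Some of the figures in this paragraph can be checked by hand. The sketch below reproduces the size of the stunting analysis set and the percentages of bloody diarrhea among sole-pathogen episodes. The denominators (20 tEPEC-only, 27 aEPEC-only, and 6 STEC-only episodes) are taken from the table column headers reproduced at the end of the document; treating them as the relevant denominators is an assumption.

```python
# Stunting analysis set: enrolled cases minus missing and implausible measurements.
enrolled_cases = 4840
missing_length = 223
implausible_values = 14
analytic_n = enrolled_cases - missing_length - implausible_values


def pct(events: int, total: int) -> float:
    """Percentage of episodes with the event, rounded to 1 decimal place."""
    return round(100 * events / total, 1)


# Bloody diarrhea among episodes with a single DEC pathotype and no Shigella.
bloody_sole = {
    "tEPEC only": pct(1, 20),
    "aEPEC only": pct(2, 27),
    "STEC only": pct(0, 6),
}
```

These reproduce the 4603 children in the analysis set and the 5% and 7.4% figures quoted for episodes without concomitant Shigella.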

DISCUSSION
Testing the stool samples in VIDA using both conventional and TAC assays provided an opportunity to directly determine whether TAC enhanced the ability to detect pathogenic DEC strains among children with MSD participating in a large, controlled study in settings with high diarrheal disease burden. Our findings indicate that although the 3 DEC pathotypes evaluated were found far more commonly using TAC than the conventional assays, the fold-increase was similar among cases and controls. Neither method identified a significant association between a DEC pathotype and MSD. In fact, EAEC, aEPEC, and STEC were detected using TAC significantly more often in controls than in cases, while the difference for tEPEC was not significant. These findings also highlight the importance of using controls in studies of diarrhea etiology.
Numerous controlled studies have indicated that tEPEC is an important cause of community-acquired diarrhea in children from LMICs, particularly among infants aged ≤12 months living in Latin America [21-26]. Therefore, it was surprising that we did not detect an association between tEPEC and MSD at any site or in any age group. Particularly unexpected was the finding among infants at the Kenyan site, where a few years previously an association with MSD had been observed in GEMS using the same conventional assay and general geographic area [8,12]. Nonetheless, other controlled studies have also failed to show a significant difference in the frequency of tEPEC in cases and controls, and some have suggested that the prevalence of tEPEC may be declining [10,27-29], thus creating an inconsistent picture of the role of tEPEC as a cause of diarrhea in LMICs.
Conflicting findings have also been reported concerning the association between MSD and the remaining DEC: aEPEC, EAEC, and STEC. The lack of a relationship that we observed in VIDA corroborates negative findings from recent studies, including the Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health and Development project (MAL-ED) [10] and others [3,10,25,30]. Among the 7 sites and 3 age strata in GEMS [23], EAEC was only associated with endemic MSD in children aged 12-23 months from Bangladesh [31-33]. In contrast, other studies have found an association with endemic diarrhea in children, especially prolonged episodes, for both aEPEC [23,31-33] and EAEC [26,34]. Although we have identified a reservoir among humans and ruminants, as seen in high-income countries [5], STEC was the least common pathotype, which is consistent with previous observations in LMICs [35,36]. Moreover, STEC was not associated with MSD or hemorrhagic colitis, presentations seen in high-income settings, using stx1 and/or stx2 with or without inclusion of eae as a marker. The conflicting results for tEPEC, aEPEC, EAEC, and STEC raise the question of whether the gene targets currently used to identify these pathogens capture some strains that are capable of causing diarrhea but require additional factors for full expression of disease.
Indeed, recent investigations have begun to uncover potential factors that may help to explain the drivers of DEC pathogenicity [37]. Genomic analyses of EPEC and EAEC have demonstrated considerable diversity in both core virulence loci and virulence plasmids, even within the same phylogenomic lineage, suggesting that these pathotypes have continued to acquire genetic changes since their initial acquisition of their defining features [38,39]. New targets associated with disease severity that have not been included in current assays have been found [39,40]. Expression of virulence factors is under the control of complex regulatory mechanisms derived from the host (eg, age, nutritional status [41], and genetic factors [37]), the organism, and environmental milieu such as intestinal microbiota, nutrients, and oxygen tension [39,41,42].
Correlations of these findings with isolate-specific clinical manifestations are being explored [43].
There is evidence to suggest that DEC induce intestinal inflammation that can lead to growth and nutritional faltering even in the absence of diarrheal disease [8,25,44-50], particularly during the first 2 years of life. In addition, both tEPEC and EAEC were associated with an increased risk of death in GEMS within 2-3 months after onset of the MSD episode [51]. Despite these observations, no DEC pathotype in VIDA was significantly associated with stunting among cases reexamined 2-3 months after enrollment. Cases with DEC were not more likely to die or to exhibit more pronounced illness when fever, blood in stool, duration of diarrhea, and vomiting were examined. DEC pathotypes were identified much more often in mixed infections with other enteric pathogens than as the sole pathogen. Compared with sole infection, coinfection with other pathogens did not significantly enhance the symptomatology in MSD cases as has been reported elsewhere, with the exception that Shigella increased the occurrence of blood in stools for all pathotypes [13]. Of note, coinfection with Shigella was consistently high, especially among MSD cases.
A significant limitation of this study is that, because TAC was performed directly on stool samples, a specimen that meets DEC criteria based on multiple virulence targets may carry those genes in different microorganisms rather than in a single strain. Another limitation is that the high frequency of asymptomatic carriage and coinfection makes it challenging to attribute clinical findings to DEC, so associations with MSD may be obscured.
In conclusion, we did not identify a role for EAEC, EPEC, or STEC in causing MSD at sites in sub-Saharan Africa. Given the diversity of the DEC strains, it is likely that particular strains or subtypes may cause disease. Future genomic analysis and investigations into the factors that regulate expression of virulence factors during diarrhea will be necessary to gain insight into the role of DEC in diarrheal disease in the African setting and elsewhere.
[Table column headers: tEPEC only, n = 20; tEPEC + any, n = 1023; aEPEC only, n = 27; aEPEC + any, n = 1070; STEC only, n = 6; STEC + any, n = 235]

Supplementary Data
Supplementary materials are available at Clinical Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.